--- title: "Class Exercise 4" output: html_document --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ``` ## R Graphics Don Edwards' coverage of graphics capabilities was quite thorough. In this exercise, I am going to demonstrate some additional capabilities of the graphics package. The example is drawn from a well-monitoring project that our statistical consulting center, the Stat Lab, helped analyze. As part of the Exploratory Data Analysis for this project, we plot analyte concentrations (in this case, periodic table elements like Zinc, Lead, Arsenic, etc) over time. New samples are collected every six months. Download the Exercise 4 R workspace from the website and look at the Barium concentrations for Well 1: ```{r eval=FALSE} load("Exercise_4.RData") Ba.540 ``` Note that we have a missing value. We will next look at commands to generate a scatterplot of the data for Well 1 and then add a smoothed trend curve; we have to use a smoothing function that can handle missing values, so there are a couple extra steps here. We are using methods here you may not be familiar with; `smooth.Ba` is the output object from the `loess()` function, in which we are *smoothing* the response data `Ba.540` as a function of `Time`. The `lines()` command actually creates the smoothed curve on the already-existing plot. ```{r eval=FALSE} plot(Ba.540) title("Standardized Barium Concentrations (ppm)") Time.seq=1:36 smooth.Ba=loess(Ba.540~Time.seq,na.action=na.omit) lines(1:36,predict(smooth.Ba,1:36)) ``` This graph is a little disappointing; e.g., note that it didn't use `names(Ba.540)` to generate an attractive set of x-axis labels. We are going to wipe out the current axes, and then add axis features in separate steps to generate more attractive labels. Be sure to pay attention to your plotting window as you complete each step; it's interesting to observe the features as they are added. ```{r eval=FALSE} plot(Ba.540,xlab="Period",ylab="Concentration",axes=F) title("Standardized Barium Concentrations (ppm)" ) lines(1:36,predict(smooth.Ba,1:36)) ``` This looks a little odd right now. In the additional set of commands, we first restore the vertical axis (the argument `2` in `axis(2)` refers to the *vertical*--or second--axis). Then we add minor tick marks to the horizontal axis (`1`), but without labels. Finally, we add labels and major tick marks to the horizontal axis; note that we use a subsetting option (`{seq(6,36,6)`) to hand-select the labels for the major tick marks from the vector `names(Ba.540)`. ```{r eval=FALSE} plot(Ba.540,xlab="Period",ylab="Concentration",axes=F) title("Standardized Barium Concentrations (ppm)" ) lines(1:36,predict(smooth.Ba,1:36)) axis(2) axis(1,at=1:36,labels=F,tck=-0.02) axis(1,at=seq(6,36,6),labels=names(Ba.540)[seq(6,36,6)],tck=-0.04,cex.axis=0.7) ``` Sometimes we have to play around with the sequencing on the horizontal axis labels. If the labels are *crowded*, we may not get the sequencing we asked for; here, we used `cex.axis` to shrink the labels so they would not be crowded. The labels refer to the semiannual data collection times--`1H05` represents data collected in the first half of 2005, for instance. We can also save our graph as a pdf file. I know you're thinking that we could just click on **Export** and choose, e.g., **Save as Image..**, but what if we've embedded the graphing commands in a function or a loop? In that case, we will need to know about the following approach to saving files. It's rather straightforward; simply use the **pdf** command with a file specification, then enter your plotting commands. All commands appearing after the `pdf` statement will be executed, though you will observe no changes to your plotting window. Execution can be halted either by opening a new pdf file, or by *closing* the current file with the simple command `dev.off()`. Here is how I would store the graph I created above; it will be saved in the working directory. ```{r eval=FALSE} pdf("Barium.pdf") plot(Ba.540,xlab="Period",ylab="Concentration",axes=F) title("Standardized Barium Concentrations (ppm)" ) lines(1:36,predict(smooth.Ba,1:36)) axis(2) axis(1,at=1:36,labels=F,tck=-0.02) axis(1,at=seq(6,36,6),labels=names(Ba.540)[seq(6,36,6)],tck=-0.04,cex.axis=0.7) dev.off() ``` Inspect the file and comment; are there any additional changes you might suggest? We can use a similar approach to save postscript files to embed in documents. ## Tidyverse We will now use `ggplot2` to generate a similar plot. We first read `Ba.540` into a data frame and save its labels as the variable `Period`, and then add a variable `Index` as a numerical plotting index and rename `Ba.540` as `Barium`. The first `ggplot` statement looks reasonable, but the x axis could have better labels (we'll talk about `aes` later in the course--it stands for *aesthetics*. The second `ggplot` statement generates labels that are too crowded and has other issues, as we'll see next. Just as before, we need to add a smoothing curve and improve those labels. ```{r eval=FALSE} library(tidyverse) CE4=as.data.frame(Ba.540)%>%rownames_to_column("Period") CE4=CE4%>%mutate(Index=1:36)%>%rename(Barium=Ba.540) ggplot(CE4,aes(Index,Barium))+geom_point() ggplot(CE4,aes(Period,Barium))+geom_point() ``` Trying to use `Period` directly as our $x$ variable is a mistake since it is not numeric and hence smoothing functions would not work with it properly; we need to use `Index` as our $x$ variable. We create our label variable `xlab` beforehand; I was a little disappointed I couldn't create it "on the run" in `ggplot`. Note that `ggplot2` has a function--`scale_x_continuous`--that creates a subset of major axis marks for our $x$ axis; its sequence needs to match the sequence used to create `xlab`. ```{r eval=FALSE} xlab=CE4$Period[seq(5,35,by=5)] ggplot(CE4,aes(Index,Barium))+geom_point()+geom_smooth()+labs(x='Period')+ scale_x_continuous(breaks=seq(5,35,by=5),labels=xlab) #+labs(x=seq(Period,by=4)) ``` Note that `ggplot` creates a confidence band for the smoothed curve by default. We can remove it by adding the option `se=FALSE` to the `geom_smooth()` function.